Product-based Neural Networks for User Response Prediction
Predicting user responses, such as clicks and conversions, is of great
importance and has found its usage in many Web applications including
recommender systems, web search and online advertising. The data in those
applications is mostly categorical and contains multiple fields; a typical
representation is to transform it into a high-dimensional sparse binary feature
representation via one-hot encoding. Faced with this extreme sparsity,
traditional models are limited to mining shallow patterns from the data, i.e.,
low-order feature combinations. Deep models such as deep neural
networks, on the other hand, cannot be directly applied to the
high-dimensional input because of the huge feature space. In this paper, we
propose Product-based Neural Networks (PNN) with an embedding layer to learn
a distributed representation of the categorical data, a product layer to
capture interactive patterns between inter-field categories, and further fully
connected layers to explore high-order feature interactions. Our experimental
results on two large-scale real-world ad click datasets demonstrate that PNNs
consistently outperform the state-of-the-art models on various metrics.
Comment: 6 pages, 5 figures, ICDM 2016
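To make the three-stage architecture concrete, below is a minimal sketch of a forward pass through an inner-product PNN variant in plain NumPy. All names, field sizes, and dimensions are illustrative assumptions, not the authors' implementation, which also includes an outer-product variant and training details omitted here.

```python
# Hedged sketch of an inner-product PNN forward pass; field sizes,
# dimensions, and variable names are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
num_fields, vocab_sizes, embed_dim, hidden_dim = 3, [100, 50, 20], 8, 16

# Embedding layer: one table per categorical field (one-hot -> dense vector).
embeddings = [rng.normal(scale=0.1, size=(v, embed_dim)) for v in vocab_sizes]

# Fully connected layers on top of the concatenated embeddings and products.
pair_count = num_fields * (num_fields - 1) // 2
W1 = rng.normal(scale=0.1, size=(num_fields * embed_dim + pair_count, hidden_dim))
W2 = rng.normal(scale=0.1, size=(hidden_dim, 1))

def pnn_forward(field_ids):
    """field_ids: one category index per field for a single example."""
    # 1) Embedding layer: look up a dense vector per field.
    e = [embeddings[f][i] for f, i in enumerate(field_ids)]
    # 2) Product layer: inner products between all inter-field pairs.
    products = [e[i] @ e[j] for i in range(num_fields)
                for j in range(i + 1, num_fields)]
    z = np.concatenate(e + [np.asarray(products)])
    # 3) Fully connected layers explore higher-order interactions.
    h = np.maximum(z @ W1, 0.0)          # ReLU
    logit = (h @ W2).item()
    return 1.0 / (1.0 + np.exp(-logit))  # predicted click probability

print(pnn_forward([7, 3, 11]))
```

The product layer is what distinguishes PNN from a plain embedding + MLP stack: pairwise interactions between fields are given to the network explicitly rather than left for the fully connected layers to discover.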
Sampled in Pairs and Driven by Text: A New Graph Embedding Framework
In graphs with rich texts, combining textual information with structural
information helps construct expressive graph embeddings. Among various graph
embedding models, random walk (RW)-based models form one of the most popular
and successful groups. However, they face two issues when applied to graphs
with rich texts: (i) sampling efficiency: starting from the training objective
of RW-based models (e.g., DeepWalk and node2vec), we show that they are likely
to generate large numbers of redundant training samples, owing to three main
drawbacks; (ii) text utilization: these models have
difficulty in dealing with zero-shot scenarios where graph embedding models
have to infer graph structures directly from texts. To solve these problems, we
propose a novel framework, namely Text-driven Graph Embedding with Pairs
Sampling (TGE-PS). TGE-PS uses Pairs Sampling (PS) to improve the sampling
strategy of RW, reducing the number of training samples by ~99% while preserving
competitive performance. TGE-PS uses Text-driven Graph Embedding (TGE), an
inductive graph embedding approach, to generate node embeddings from texts.
Since each node contains rich texts, TGE is able to generate high-quality
embeddings and provide reasonable predictions about the existence of links to
unseen nodes. We evaluate TGE-PS on several real-world datasets, and the
experimental results demonstrate that TGE-PS achieves state-of-the-art results
on both traditional and zero-shot link prediction tasks.
Comment: Accepted by WWW 2019 (The World Wide Web Conference. ACM, 2019)
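As a rough illustration of why window-based RW training generates redundant pairs, the sketch below contrasts DeepWalk-style pair enumeration with a bounded per-node pair-sampling scheme on a toy graph. The graph, parameters, and sampling rule here are assumptions made for illustration; the actual Pairs Sampling strategy is defined in the paper.

```python
# Hedged sketch contrasting RW-style pair generation with direct pair
# sampling; the toy graph and all parameters are illustrative only.
import random

graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}  # toy adjacency list

def rw_pairs(walk_len=40, window=5, walks_per_node=10):
    """DeepWalk-style: emit every (center, context) pair within a window
    of every walk; many pairs repeat, which is the redundancy PS targets."""
    pairs = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_len - 1):
                walk.append(random.choice(graph[walk[-1]]))
            for i, u in enumerate(walk):
                for v in walk[max(0, i - window): i + window + 1]:
                    if v != u:
                        pairs.append((u, v))
    return pairs

def ps_pairs(samples_per_node=5):
    """Pair-sampling sketch: draw a bounded number of (node, neighbor)
    pairs per node instead of enumerating all window co-occurrences."""
    return [(u, random.choice(graph[u]))
            for u in graph for _ in range(samples_per_node)]

print(len(rw_pairs()), "RW pairs vs", len(ps_pairs()), "PS pairs")
```

Even on this four-node graph, window enumeration emits thousands of heavily repeated pairs while direct sampling emits a few dozen, which is the intuition behind the ~99% sample reduction the abstract reports.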